ABSTRACT


Consistent and well-defined criteria for the classification and measurement of 
humpback whale song features are essential for robust comparisons between investi- 
gators. Song structure terminology has been well-established and used by many 
authors, though at times inconsistently. This review discusses the development of 
the nomenclature describing humpback song and explores the potential significance 
of the often-overlooked variation in song patterns. Within the hierarchical definition 
of humpback song, the most problematic issues arise from the inconsistent delineation
of phrase types, and the use of the metric of song duration without regard to
variability in thematic sequence. With regard to the former, a set of guidelines is
suggested to facilitate consistent delineation of phrases. With regard to the latter,
current research demonstrates that the “song duration” metric has resulted in the dis- 
regard of variability at this level, which is more widespread than traditionally 
reported. An exemplar case is used to highlight the problem inherent in defining 
and measuring song duration. Humpback song is evaluated within the framework of 
avian songbird research, and a shift in analysis paradigm is recommended, towards 
phrase-based analyses in which sequences of phrases are treated as a salient feature of 
song pattern. 


Key words: humpback whale, Megaptera novaeangliae, song structure, song classifica- 
tion, eventual variety, avian song. 


Variation in a behavioral trait provides the opportunity for selection and evolution, 
and therefore is of interest to behavioral ecologists. Detailed qualitative description of 
a species’ behavior, such as their sound repertoire, is important. Nevertheless, the 
questions that drive the advancement of knowledge about a species’ communication 
system are often answered by quantitative analyses of the variation in specific traits.
Here we review and discuss the development of the nomenclature describing hump- 
back song and explore the potential significance of the often-overlooked variation in 
song patterns. 


A SHORT HISTORY OF THE STUDY OF HUMPBACK WHALE SONG 


As recently as the 1940s, scientists did not know whether baleen whales produced 
sounds. Although cetacean sounds were apparently well known to historic whalers 
(Aldrich 1889), the early scientific community concluded that cetaceans were mute 
based upon the discovery that they did not possess vocal cords (Schevill and
Watkins 1962). It was not until after the confirmation of sound production by odont- 
ocetes (Fraser 1947, Kullenberg 1947, Schevill and Lawrence 1949) that scientists 
began to realize that mysticetes also produced vocalizations. 

The first documentation of humpback vocalizations within the scientific commu- 
nity was presented in 1952, when Schreiber (1952) recorded sounds with a “musical 
quality” off the island of Oahu, which he attributed to marine life. We now know 
that these musical sounds were produced by humpback whales. A decade of research 
after Schreiber’s discovery uncovered the fact that humpbacks produce a wide range
of vocalizations on the breeding grounds and en route to them, as well as on the feed- 
ing grounds. By 1964 their breeding sounds were becoming well known: “...the 
sonorous moans and screams associated with the migrations of Megaptera past 
Bermuda and Hawaii may be an audible manifestation of more fundamental vernal 
urges...” (Schevill 1964). Off the coast of New Zealand, acousticians were making 
similar discoveries, although they did not positively identify the sounds as humpback 
in origin. They described how a “...chorus of squeals, creaks, cries, barks, groans, and 
whoops...” had been labeled the “Barnyard Chorus” by the laboratory staff (Kibble- 
white et al. 1967). The same author even supposed that one individual might produce 
all the different sounds in this chorus, though he did not attempt to confirm this. 
Ironically, these New Zealand researchers may have documented the (near) extirpa- 
tion of a humpback whale breeding population through acoustic monitoring without 
realizing it; they described the decline in these sounds from 1960 to 1963, and stated 
that no positive instances were recorded after 1963. Around the same time, a spectro- 
graphic catalog of some of the sounds known to be produced by humpback whales 
demonstrated the diversity of these vocalizations (Tavolga 1968). In 1970, in a tech- 
nical report written for the Naval Undersea Research and Development Center, 
Cummings and Philippi (1970) described repetitive “stanzas” recorded in late December
in the northwest Atlantic. Their low sampling rate precluded the detection of any
sounds above 175 Hz, yet the authors were able to identify sound series that lasted 
11-14 min, including pulses, blips, and moans. They tentatively identified these 
sounds as being North Atlantic right whale (Eubalaena glacialis) vocalizations; in all 
likelihood, they were actually listening to humpback whales (Payne and Payne 
1971). 

It seems that scientists working in every ocean basin were on the verge of a new 
discovery, but it was not until 1969 that these sounds were finally reported by Roger 
and Katy Payne as the “song” produced by humpback whales (Anonymous 1969), 
followed shortly thereafter by a conference presentation by Howard Winn (Winn 
et al. 1970). In 1971 Payne and McVay published a pivotal paper describing the pat- 
terned, hierarchical structure of these sounds, making the first connection with bird 
song. These authors relied on Broughton’s (1963) definitions of the term song,
including: “... a series of notes, generally of more than one type, uttered in succession 
and so related as to form a recognizable sequence or pattern in time,” recognizing that 
what humpbacks were producing fit into this framework. Winn and Winn (1978) 
supported this discovery with a separate description of the patterned sequences of 
sounds recorded from humpback whales in Bermuda and the West Indies, and a 
slightly different approach towards characterization of song structure than Payne and 
McVay (1971). 

From that time on, investigations regarding the complexity, function, and pattern 
in humpback whale song have grown in number. In 1979 the first paper
attempting to quantify signature information in song units was published (Hafner 
et al. 1979). The authors did not recognize the significance of the evolving nature of 
humpback song at this time, and thus variation due to temporal changes was 
confounded with individual variation. Temporal variation, or rapid cultural evolu- 
tion, was soon to be described in detail (Guinee et al. 1983, Payne et al. 1983, Payne 
and Payne 1985). Singing humpback whales were shown to be males (Glockner 
1983), and the function of song within the breeding season was hypothesized to play 
a role in female attraction (Winn and Winn 1978, Herman and Tavolga 1980, Tyack 
1981) or mediate male-male interactions (Darling 1983, Darling et al. 2006, Cholewiak
2008). Further studies have compared song patterns across regions (e.g., Payne
and Guinee 1983, Helweg et al. 1990, Cerchio et al. 2001, Darling and Sousa-Lima
2005, Garland et al. 2011), or seasons (Winn and Winn 1978; Mattila et al. 1987;
Noad et al. 2000; Eriksen et al. 2005; Mercado et al. 2003, 2005). This work has
repeatedly demonstrated four universally observed features of humpback song: (1) 
populations or groups of males that are in acoustic contact at some point in time and 
space sing “similar” songs, which is to say that their songs share visually
and aurally recognizable similarities in pattern that are characteristic of that time/
region; (2) the overall hierarchical structure is observed globally, and is thus a heritable
species-level characteristic, although the details of song patterns differ between populations
of males that are acoustically isolated during all seasons; (3) song patterns
change over time as a result of individual males modifying the spectral and temporal 
features of song units, as well as their order and repetition; and (4) males in acoustic 
contact incorporate similar changes into their own songs, maintaining continuity 
within populations despite progressive temporal changes. 

Between the 1970s and early 1980s, the terminology used to describe humpback 
whale songs was firmly cemented within the scientific literature (Payne and McVay 
1971, Payne et al. 1983). Other authors occasionally varied this vocabulary (Winn 
and Winn 1978), but for the most part it has remained in use as originally proposed 
by Payne and McVay. This terminology is not, however, without complications, and 
inconsistencies in its application have led to incongruities in the literature (for 
example, compare Thompson and Friedl 1982 with McSweeney et al. 1989).

Despite the consistency of song patterns exhibited among males within a breeding 
population, researchers have noted a fair amount of variation in humpback whale 
song, both within the songs of an individual and between the songs of different indi- 
viduals (first noted by Payne and McVay 1971). Early on, Frumhoff (1983) conducted 
an extensive review of “anomalous” songs, and while early analyses suggested that 
song sequence was extremely stereotyped (Payne and McVay 1971, Winn and Winn 
1978, Payne et al. 1983), later studies have demonstrated that song structure is not 
always as consistent as was first reported (Helweg et al. 1990, 1992, 1998; Eriksen 
et al. 2005). Measurements of variation both on the overall pattern level, as well as on
the level of individual song units, are clearly important for understanding the
function of song within this species and the influence of sexual selection on singing 
behavior. 

Recently, there has been an increase in attempts to develop automated procedures 
for the classification of individual song elements, appearing in both refereed 
(Rickwood and Taylor 2008, Pace et al. 2010, Green et al. 2011) and nonrefereed
literature (Mazhar et al. 2008, Picot et al. 2008). These procedures range from semiautomated
(requiring manual selection of song elements) to fully automated
(including both the detection of and classification of song elements). In general, these 
methodologies are largely still under development and demonstrations of their success 
are as yet limited. Moreover, for studies of song evolution (rather than song element 
classification), the variation exhibited at individual, population, and temporal levels 
will likely present a significant problem for automated classifiers to recognize similar 
elements as they undergo progressive change over time. Studies assessing song over 
protracted geographic or temporal scales require the accurate classification and group- 
ing of similar song elements in order to draw valid conclusions regarding song 
transmission and behavioral processes, and this is often confounded by sparse 
sampling. There is clearly a need for the development of automated classification sys- 
tems in order to handle large data sets of recordings; however, no system has yet been 
demonstrated which compares with the pattern recognition skills of the human brain. 

Consistent and well-defined criteria for the classification and measurement of 
humpback whale song features are essential for robust comparisons between investi- 
gators. In humpback song analyses, arbitrary divisions are imposed on what is other- 
wise a continuous vocal sequence. At times, these divisions are constructed 
inconsistently. Payne and McVay’s (1971) terminology has been established and used 
by many authors, even though no practical rules were provided to guide the delinea- 
tion of boundaries between different types of units, phrases, or themes. This lack of 
specific guidance has contributed to differences among authors in adopting criteria 
used to partition continuous humpback whale singing sessions. Additionally, in light 
of ongoing research, it has become clear that the rigid structure suggested by Payne 
and McVay (1971) is not universally applicable to all songs. 

The objectives of this review are to (1) suggest consensus criteria for song structure 
definitions and discrimination and (2) engage in a discussion on the use of song cycle 
metrics in the humpback system, with suggestions for revision of the traditional anal- 
ysis paradigm. While it may not be possible to develop rules that will work univer- 
sally, we believe that establishing guidelines will be constructive, especially as this 
field continues to expand rapidly. These suggestions are based upon our combined 
experience with humpback whale song spanning seven different regions in four ocean 
basins with recordings collected over three decades. Our aim is to better enable comparable
quantitative analyses between investigators. However, we are aware that our
suggestions may not be suitable for everyone; therefore, we urge future authors to
explicitly and thoroughly describe the way they generate the humpback whale song 
classifications and metrics in their publications. 


REVIEW OF SONG STRUCTURE DEFINITIONS AND CRITERIA 


The framework by which humpback whale song is defined and categorized is essen- 
tially a model, used to make inference about the singing behavior of humpback 
whales. As famously noted by Box and Draper (1987), all models are by nature an 
approximation, and the best models approximate closely enough to be useful in inference;
importantly, even the most useful of models should regularly be revisited when new 
data become available to improve upon them. Payne and McVay (1971), in an effort 
to describe their observation of predictable, repetitive vocal cycles, described a hierar- 
chical structure that has been widely adopted and utilized by researchers for over 
30 yr. The shortest sound is called a unit, which may be divided into subunits when 
comprised of pulses that are too rapid to be individually discriminated at real speed. 
A set of units is combined to form a phrase. Similar phrases are repeated to form a 
theme. The song is defined as the combination of multiple distinct themes. A song 
session consists of a series of repeated songs with silent intervals of less than a minute, 
which are typically not discernibly longer than those intervals between phrases or 
themes. Our aim is to assess this framework, identify where it has been useful, and 
suggest revision where it may be improved. 
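
The nesting described above can be pictured as a simple data structure. The following is only an illustrative sketch in Python, not a representation used by Payne and McVay (1971) or by any study cited here; all class and field names (`Unit`, `Phrase`, `Theme`, `Song`, `n_subunits`, and so on) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Unit:
    """Shortest sound that seems continuous at real speed."""
    label: str            # hypothetical unit-type label, e.g. "moan"
    onset_s: float        # onset time in seconds
    offset_s: float       # offset time in seconds
    n_subunits: int = 0   # pulses within the unit; 0 if it is not a pulse train


@dataclass
class Phrase:
    """A set of units; similar phrases are repeated to form a theme."""
    phrase_type: str
    units: List[Unit] = field(default_factory=list)


@dataclass
class Theme:
    """A run of similar phrases."""
    phrases: List[Phrase] = field(default_factory=list)

    @property
    def phrase_type(self) -> str:
        # A theme takes the type of its first phrase; empty themes have none.
        return self.phrases[0].phrase_type if self.phrases else ""


@dataclass
class Song:
    """A combination of multiple distinct themes."""
    themes: List[Theme] = field(default_factory=list)

    def theme_sequence(self) -> List[str]:
        """Order in which phrase types (themes) occur in this song."""
        return [theme.phrase_type for theme in self.themes]


# Toy example: two phrases of type "A" followed by one phrase of type "B".
unit = Unit(label="moan", onset_s=0.0, offset_s=1.2)
song = Song(themes=[
    Theme(phrases=[Phrase("A", [unit]), Phrase("A", [unit])]),
    Theme(phrases=[Phrase("B", [unit])]),
])
print(song.theme_sequence())
```

Keeping the phrase as an explicit level makes it straightforward to analyze sequences of phrases directly, rather than only whole-song cycles.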


Subunits and Units 


The term subunit is one that has multiple uses within the literature. It was origi- 
nally defined (Payne and McVay 1971) as a component of a sound that is aurally 
indistinguishable as a discrete vocalization. When examined with appropriate spec- 
trogram parameters or played at a slower rate, seemingly continuous sounds may be 
shown to be composed of discrete pulses (Fig. 1). Grating or rasping sounds may be 
considered a single unit, but are actually pulse-trains that are made up of a series of 
pulses, or subunits. It is important to note that in these cases, spectrogram parameters 
(such as the number of points in a Fast Fourier Transform (FFT)) will determine the 
resolution of subunits. 
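
To make the dependence on spectrogram parameters concrete, the sketch below computes the frequency bin spacing and window duration implied by a given FFT size. The sampling rate of 8,000 Hz is purely an assumption for illustration (the recordings discussed here may have used a different rate), and the function name is hypothetical. Shorter FFTs give finer time resolution, which is what allows individual pulses to be resolved, at the cost of coarser frequency resolution.

```python
def spectrogram_resolution(fft_points: int, sample_rate_hz: float, overlap: float = 0.5):
    """Return (bin spacing in Hz, window duration in s, hop between windows in s)."""
    if fft_points <= 0 or sample_rate_hz <= 0 or not 0 <= overlap < 1:
        raise ValueError("fft_points and sample_rate_hz must be positive; 0 <= overlap < 1")
    bin_hz = sample_rate_hz / fft_points      # spacing between frequency bins
    window_s = fft_points / sample_rate_hz    # duration of one analysis window
    hop_s = window_s * (1.0 - overlap)        # time step between successive windows
    return bin_hz, window_s, hop_s


ASSUMED_SAMPLE_RATE_HZ = 8000.0  # assumption for illustration only

for n in (1024, 256, 128):
    bin_hz, window_s, hop_s = spectrogram_resolution(n, ASSUMED_SAMPLE_RATE_HZ)
    print(f"{n:>5} pt FFT: {bin_hz:.2f} Hz bins, "
          f"{window_s * 1000:.0f} ms window, {hop_s * 1000:.0f} ms hop")
```

Under this assumed sampling rate, pulses spaced more closely than roughly one window duration will tend to merge into a single continuous-looking unit.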

More recently, the term subunit has been adopted primarily to refer to the com- 
ponents of a sound distinguished by frequency discontinuities or inflection points. 
Effort is now being directed towards evaluating whether these sound components 
can be useful for automatic classification (Pace et al. 2010). This effort reflects the 
progress within the broader field of automated call classification, in which similar 
analyses are being conducted on odontocete whistles (Shapiro et al. 2011). 

A unit is defined (Payne and McVay 1971) as the shortest sound that seems contin- 
uous when evaluated at real speed (Fig. 2). This is analogous to a “note” or “element” 
in the avian song literature (Isaac and Marler 1963). Winn and Winn (1978) called
these individual sounds “syllables,” which is somewhat confusing, as “syllables” in 
the avian literature may actually be composed from groups of notes that have inter- 
note intervals of silence that are shorter than the duration of the adjacent notes (Isaac 
and Marler 1963). 

We suggest that the hierarchical levels of unit and subunit be maintained. These 
definitions are simple and unambiguous, allowing robust comparisons. The term unit 
has been more widely used within the humpback literature, and should be used in 
place of the term syllable. 


Subphrases and Phrases 


A subphrase (Payne et al. 1983) is a sequence of one or more units that is sometimes 
repeated in a series (Fig. 2). These units of repetition were called “motifs” by Winn 
and Winn (1978), who divided them into two types: similar (containing only one 
type of unit, repeated several times) or dissimilar (containing two or more different 
units, which are repeated in combination several times). 

Figure 1. Two different examples of a humpback song unit comprised of subunits, recorded
at Isla Socorro, Mexico. The top panel in each column displays the amplitude envelope, and 
the lower panels display the spectrogram (Hann window, 50% overlap) or part thereof, show- 
ing the unit and subunit structure. (A) Recorded 27 March 2006. The middle segment of this 
song unit is composed of a pulse train of subunits. Top panel: 1,024 pt FFT (Fast Fourier
Transform); middle panel: 256 pt FFT; bottom panel: 256 pt FFT. (B) Recorded 8 April
2004. Typical “ratchet” sound, in which the entire unit is composed of rapidly produced
pulses. Top panel: 1,024 pt FFT; middle panel: 256 pt FFT; bottom panel: 128 pt FFT. Note
that the full frequency range of these example units exceeds the frequency range chosen for 
their display in this figure. 


Multiple subphrases are grouped into a phrase (Payne and McVay 1971). Similar 
phrases are generally repeated from a few to many times before a different phrase type 
is introduced (Fig. 2). However, consecutive phrases may have different numbers of 
units, as well as a change in the spectral and/or temporal characteristics of the units, 
while still being identifiable as belonging to the same repetitive sequence. As noted 
by Payne and McVay (1971), phrases are “inexact replicas” of one another. We 
suggest that the phrase may be considered the salient element of repetition within 
humpback song. Differing from Payne and McVay’s original interpretation (1971), 
we propose that the phrase hierarchical level is most analogous to the structural level 
of “song” in the avian literature (see below for further discussion on this topic). 

Unfortunately, the delineation of phrases may be difficult and ambiguous. Typi- 
cally, the intervals between successive phrases are the same as the intervals between 
units within a phrase (in contrast to most bird song, where intersong intervals are 
Figure 2. Spectrographic representation of humpback whale song sequence, recorded at Isla
Socorro, Mexico, on 27 March 2006 (1,024 pt FFT [Fast Fourier Transform], Hann window, 
50% overlap). Time on the x-axis is in minutes:seconds, while frequency on the y-axis is in 
kHz. (A) One phrase consisting of two subphrases. The first subphrase is composed of one unit, 
while the second subphrase is composed of a repeating pattern of 2–3 individual units. Note
that units 3, 5, 6, and 8 within the example phrase are continuous, although their full frequency
range exceeds that chosen for display in this figure. (B) 155 s sequence of song, in
which multiple phrase types can be observed. Phrases have been delineated by vertical lines.


noticeably longer than internote intervals [Isaac and Marler 1963]). The delineation 
of phrase structure is subjective, such that a choice needs to be made about where 
within a sequence of units one will start a phrase, and at what point variation in 
phrase structure is significant enough to delineate a new phrase type. 

Therefore, differences in methodology between authors make it challenging to 
compare studies, even of song within the same region. For example, two authors ana- 
lyzing song recorded off Hawaii in 1979 divided the same sequence into four types of 
phrases (Thompson and Friedl 1982) and seven types of phrases (McSweeney et al.
1989). Discrepancies such as these complicate comparisons.

However, despite challenges in labeling and delineating phrases, multiple authors 
have recognized that phrase duration is one of the most stable features of humpback 
whale song, with very low coefficients of variation within and between individuals 
(Frumhoff 1983, Payne et al. 1983, Cerchio 1993, Cerchio et al. 2001). This strongly
suggests that there is significance to the phrase as an element that is important to 
humpback whales, and delineation of phrases is not an arbitrary construct of research- 
ers attempting to force organization upon the display. The same does not appear to 
be true of the “song” measurement that will be detailed below. 

In an attempt to partly remedy variability in methodology, we suggest the follow- 
ing simple guidelines for delineating and measuring phrases, which could be adopted 
in any song: 


(1) Consecutive units of similar structure should not be separated within a phrase, 
but should be kept together as parts of a subphrase. 

(2) Phrases should be delineated in a way that minimizes the occurrence of an 
incomplete phrase at the end of a sequence of similar phrases (also called “hang- 
ing” phrases, consisting of only a portion of the repetitive structure, such as one 
subphrase). 

(3) “Transitional” phrases combine units from two different phrase types (Payne and 
Payne 1985), usually an entire subphrase from the previous and subsequent 
themes. For example, using letters to indicate subphrases, in the phrase sequence: 


ab ab ab ad cd cd cd 


ad is a transitional phrase, composed of subphrase “a” and subphrase “d.” These 
phrases should be identified as such, and not mistaken for new phrase types. 

(4) Care should be taken to recognize inherent variation within phrase types, and
thereby to distinguish between patterns that are variants of a single phrase type 
vs. completely different phrase types. Variation within a phrase type may involve 
differences in structure or repetition of units, without a consequent shift in the 
overall pattern of the subphrase or phrase; conversely, when the pattern, composi- 
tion and/or number of units in a phrase dramatically differs, and is maintained 
within the sequence, a completely different phrase type should be defined. This is 
not necessarily to advocate lumping over splitting, but rather to define different 
levels of variation. As this process is unavoidably qualitative and somewhat sub- 
jective, exemplar spectrograms illustrating different phrases and variants should 
always be presented in publications, and authors (and editors) should avoid 
reporting phrase classification only nominally without supporting figures. 

(5) Duration of phrases should be measured including the interval between phrases 
(i.e., measuring from the onset of the first unit in one phrase to the onset of the 
analogous unit in the subsequent phrase). Measurements made in this way will 
be robust regardless of how one chooses to delineate phrases. In fact, if measured 
consistently, phrase duration has the least variation within and between singers 
of a particular population (Frumhoff 1983, Cerchio et al. 2001). 

(6) A review of song based on recordings of multiple individuals is essential for 
appropriately assigning phrase presence and structure; song structure should 
never be delineated based on recordings from a single individual, and only cau- 
tiously for small samples of individuals. When there is ambiguity in the assignment 
of units to phrases, we urge authors to search for consistency in interindividual
pattern or consistency in subphrase structure. 
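
Guideline (5) can be illustrated with a short sketch. The Python below is a minimal example under stated assumptions, not the measurement procedure of any cited study: the onset times are invented, and the function names are hypothetical. It measures phrase duration from the onset of a chosen unit in one phrase to the onset of the analogous unit in the next, and summarizes variability with a coefficient of variation (sample standard deviation divided by the mean).

```python
from statistics import mean, stdev
from typing import List, Sequence


def phrase_durations(phrase_onsets_s: Sequence[float]) -> List[float]:
    """Onset-to-onset durations, so each value includes the interphrase interval."""
    pairs = list(zip(phrase_onsets_s, phrase_onsets_s[1:]))
    if any(later <= earlier for earlier, later in pairs):
        raise ValueError("onsets must be strictly increasing")
    return [later - earlier for earlier, later in pairs]


def coefficient_of_variation(values: Sequence[float]) -> float:
    """Sample standard deviation divided by the mean."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    return stdev(values) / mean(values)


# Invented onsets (s) of the first unit in five consecutive phrases.
first_unit_onsets = [0.0, 14.0, 28.5, 42.5, 57.0]
durations = phrase_durations(first_unit_onsets)

# Delineating phrases from a different unit (here assumed to start 3 s later
# in every phrase) shifts every onset equally and leaves the durations unchanged.
other_unit_onsets = [t + 3.0 for t in first_unit_onsets]

print(durations)
print(phrase_durations(other_unit_onsets))
print(round(coefficient_of_variation(durations), 4))
```

The second call shows why this measure is robust to the choice of phrase boundary: as long as the same analogous unit is used in every phrase, the starting point cancels out.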


We suggest that the hierarchical level of the phrase be maintained, with closer 
attention being given to criteria used to delineate phrases. In addition, the use of 
recordings that provide adequate signal-to-noise ratio and encompass the entire 
fundamental frequency bandwidth of the song structure is important, as failure to do 
so will preclude the detection of some units and may lead to misinterpretation of 
song structure. When possible, a multiyear review of song, composed of multiple 
individuals from each year, should be conducted to examine phrase pattern and evolu- 
tion, which may aid in classification of phrase organization. Although we will not 
attempt to evaluate or provide guidelines on appropriate sampling protocols, 
sampling is as critical with humpback whale song as it is with any other data type 
(e.g., genetic sampling). It is important to optimize both quality of samples (signal- 
to-noise ratio and recording length) and sample size (number of different individuals 
recorded) to accurately document population level parameters. 


Themes 


A sequence of similar phrases is defined as a theme (Payne and McVay 1971), and 
therefore a new phrase type within the song sequence initiates a new theme. Individ- 
ual males may sing different numbers of phrases, both in different themes and in con- 
secutive renditions of the same theme. Thus, the length of any given theme varies 
both within and between individuals. Frumhoff (1983) classified fundamental themes 
as those present in all songs in at least 90% of the recordings in both a given season 
and at least one contiguous season. This concept was driven by the idea that thematic 
order is fixed (and “aberrant” songs strayed from that order when “fundamental” 
themes were omitted). Other authors have classified “fundamental” themes as those 
present in 95% of all recordings within one season (Chu and Harcourt 1986). Due to 
the evolving nature of humpback song, the concept of a fundamental theme is 
ephemeral. 
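
Once phrases have been classified, the definition of a theme as a run of similar phrases is easy to express. The sketch below is a deliberately simplified illustration, assuming each phrase has already been reduced to a string with one letter per subphrase (as in the "ab ab ab ad cd cd cd" example in guideline 3); the function names and the rule used to flag transitional phrases are assumptions made here, and real classification remains qualitative.

```python
from itertools import groupby
from typing import List, Tuple


def group_into_themes(phrases: List[str]) -> List[Tuple[str, int]]:
    """Collapse consecutive identical phrase labels into (phrase_type, count) runs."""
    return [(label, len(list(run))) for label, run in groupby(phrases)]


def flag_transitional(runs: List[Tuple[str, int]]) -> List[Tuple[str, int, bool]]:
    """Flag a lone phrase that starts like the preceding run and ends like the next."""
    flagged = []
    for i, (label, count) in enumerate(runs):
        is_transitional = False
        if count == 1 and len(label) >= 2 and 0 < i < len(runs) - 1:
            prev_label, next_label = runs[i - 1][0], runs[i + 1][0]
            is_transitional = (
                bool(prev_label) and bool(next_label)
                and label[0] == prev_label[0]      # first subphrase from previous theme
                and label[-1] == next_label[-1]    # last subphrase from following theme
            )
        flagged.append((label, count, is_transitional))
    return flagged


sequence = "ab ab ab ad cd cd cd".split()
runs = group_into_themes(sequence)
flagged = flag_transitional(runs)
themes = [label for label, _count, transitional in flagged if not transitional]

print(runs)
print(flagged)
print(themes)
```

Because the number of phrases per run is retained, the same output also shows how theme length varies from one rendition to the next.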

Payne and Payne (1985) further classified themes into three different types based 
on their organization: (1) Static themes are those with a sequence of nearly identical 
phrases. (2) Shifting themes are those in which successive phrases evolve progressively 
from one form to another. Units may gradually change in frequency and/or form, 
duration, or number of subunits, or be delivered at a slower or faster rate (i.e., variation
in interunit interval). Phrases may evolve such that changes are progressive, systematic
and irreversible with each successive repetition. (3) Unpatterned themes are
those in which a variable number of units have no clear organization and thus cannot 
be subdivided into repeating phrases. The result is a theme composed of one single, 
long phrase. (Another type of one-phrase theme noted by Payne and McVay (1971) 
occurs when a single phrase is composed of unique material that does not resemble 
the previous or next theme but yet occurs consistently.) It should be noted that un- 
patterned themes appear to be rare, occurring only twice in the cumulative experience 
of the authors (DMC, RSS-L and SC, unpublished data). 

The classification of fundamental themes has not been widely adopted in the litera- 
ture. The categorization of themes into types has also not been widespread, but Payne 
and Payne’s (1985) proposition of static, shifting, and unpatterned themes does 
characterize humpback whale singing behavior across many regions and years (DMC, 
RSS-L and SC, unpublished data). 

Because song is constantly undergoing progressive temporal change, the processes
by which changes occur can have a profound effect on the structure and classification 
of song elements. An exploration of all such processes is beyond the scope of this 
paper, but examples reported in the literature exemplify the phenomenon and under- 
score the need for consideration of progressive change during classification. Several 
studies have documented gradual change within and between years (Payne et al. 
1983; Cerchio et al. 2001; Eriksen et al. 2005) and have endeavored to maintain the 
nomenclature for phrases and themes across time despite changes. We emphasize that 
it is important to maintain the integrity of phrase/theme “lineages” when conducting 
studies of similarity across time, so as not to confuse the evolution of existing phrases 
with the introduction of new material (completely different phrases). 

Conversely, in some cases, different variants of the same phrase type (as defined 
above) that are present in the same song and sung in an inconsistent and interchange- 
able sequence may be indicative of the evolution of a new “theme” (Payne et al. 
1983). Cerchio (1993) and Cerchio et al. (2001) noted that such instances represented
the splitting of a theme and “birth” of a new theme over multiple seasons. In the 
schematic example represented in Table 1, two variants of the same phrase existed in 
year 1. They shared their second subphrase (d), but their first subphrase represented 
variants (a or b). They were uttered interchangeably and were considered part of the 
same theme. In year 2, each variant had evolved and diverged, but subphrase “d” was 
still similar between the two. The phrase types were no longer sung interchangeably, 
but had become clearly sequential as typical of consecutive static themes. At this 
point, they appeared to be treated as two distinct themes. By year 3, subphrase “d” 
had been dropped completely from the second phrase type, and the two phrases (and 
consequently, themes) no longer showed any similarity to one another. In this case, 
the change in phrase structure occurred at the subphrase level. In an analysis of song 
pattern, Cerchio et al. (2001) designated themes in such a way as to follow the changing
pattern in subphrase modifications. This convention may reveal rules of change
over extended periods of time and extensive geographic scales. 

We suggest that the definition of a theme be clarified to emphasize subphrase 
structure, such that a theme is a repetition of phrases that have similar subphrases in 
common. Between two phrase types, when one subphrase is similar but another sub- 
phrase is consistently different (called “rhyming” phrases in Guinee and Payne 1988), 
a new theme should be designated based on the sequence consistency of the structural 
change. 

Care should be taken in classifying themes when comparing songs from different 
regions or time periods. When multiyear samples are available, continuity in naming 


Table 1. Example of structural changes occurring at the subphrase level, resulting in the 
splitting of a single theme into two themes over a 3 yr period. In this schematic representa- 
tion, subphrase type is indicated by a lower case letter (a, b, or d) and temporal change in the 
internal structure of the subphrase across seasons is indicated by a prime (a → a’ → a”). This
process was observed in two sets of themes reported in Cerchio et al. (2001), themes 2A/2B 
and 3A/3B leading up to the year that was analyzed in that study. 











Year   Phrase type 1   Phrase type 2   Example sequence
1      ad              bd              ...-ad-ad-ad-bd-bd-ad-ad-bd-...
2      a’d’            b’b’b’d’        ...-a’d’-a’d’-a’d’-b’b’b’d’-b’b’b’d’-...
3      a”d”            b”              ...-a”d”-a”d”-a”d”-b”-b”-b”-b”-...

among years should be maintained, so that an individual “lineage” of a theme/phrase
type may be followed over time. Similar themes in different adjacent regions (i.e.,
where the potential for acoustic interaction exists) should be labeled as such, regardless
of their position within the song sequence or the presence or absence of other
themes. In one study (Maeda et al. 2000), what appeared to be similar themes in two
different regions were mistakenly treated as different themes; the analyses of theme
presence and evolution were therefore incomparable between the two regions, complicating
interpretation of the results.


Song 


A sequence of themes comprises a song, according to the definition proposed by Pay- 
ne and McVay (1971). This description was further developed by Frumhoff (1983), 
who described a song as a series of at least three themes, organized in a predictable 
sequence, repeated in the same order two or more times. The choice of the theme that 
starts the “song” cycle is considered arbitrary, since males usually sing in a continuous 
bout without stopping between repeated cycles (e.g., in one famous example, Winn 
and Winn (1978) report having recorded a humpback whale singing for 22 h). The 
use of this definition of song will be discussed in detail in the following section. 


DISCUSSION OF THE HIERARCHICAL LEVEL OF SONG IN HUMPBACK WHALES 
Variation in Sequence Consistency 


Early studies suggested that humpback song patterns were produced in a fixed and 
ordered sequence (i.e., a male would sing themes 1 to 2 to 3 and then repeat the same 
sequence). This observation led to the conclusion that a hierarchical structure existed 
at this level that was extremely stereotyped and rigid (Payne and McVay 1971; Winn 
and Winn 1978; Winn et al. 1981; Payne and Payne 1985; Guinee and Payne 1988). 

Figure 3. Spectrograms of individual humpback whale song recorded in (A) Brazil and (B) 
Mexico. The y-axis displays frequency in Hz and the x-axis displays time elapsed (minutes:sec- 
onds). (A) Spectrogram of 3,500 s of song recorded off the coast of Brazil in 2005 (256 pt FFT 
[Fast Fourier Transform] size, Hann window). Note the cyclical pattern of the signal in the 
Brazil sample. (B) Spectrogram of 4,200 s of song recorded from Isla Socorro, Mexico in 2004 
(256 pt FFT, Hann window, 50% overlap). The traditional “cyclical” pattern is not observed 
in this example. 

Figure 4. Transition diagrams showing the thematic sequence for multiple singers in two 
different years, demonstrating (A) “highly invariant” and (B) “highly variable” theme order, 
respectively. The numbers along each line indicate the number and proportion of total transi- 
tions from one theme to another. Note that transitions between phrases within each theme are 
not represented here. (A) Continuous recordings of five males from Kauai, Hawaii, in 1991 
were combined for approximately 7 h of theme sequence analysis. Themes are labeled by 
sequential numbers and letters. There were a total of 197 theme transitions; thematic sequence 
may be considered “highly invariant,” as 96% of thematic transitions follow the predicted 
order. (B) Continuous recordings of five males from Isla Socorro, Mexico in 2004 were com- 
bined for 5.8 h of theme sequence analysis. Themes were designated by letters, to avoid any 
unintentional bias associated with designating themes by sequential numbers. There were a 
total of 150 theme transitions; thematic sequence was quite variable in this year. Some theme 
transitions were extremely rare or nonexistent, while others were prominent. However, a clear 
cyclical pattern is not observed, as reversals between themes B & G, O & B, and B & Y are 
common. 

It is undeniable that there is an overall cyclical pattern at this level of song organi- 
zation (see Fig. 3A, Brazil and Fig. 4A, Hawaii). However, detailed reviews of the 
literature reveal that the stereotypy of theme order may vary considerably among and 
within individuals, or between years. Frumhoff (1983) coined the term “aberrant 
song session” for a session in which a song cycle varied from the “current” norm, specifi- 
cally one in which two themes that are usually separated by a fundamental theme are sung 
in succession. In general, this was considered uncommon. 

Later work, however, suggested a higher degree of variation in theme order. In a 
small sample of songs recorded in Mexico and Hawaii during 1989-1990, Helweg 
et al. (1990, 1992) reported that the sequence of themes in the songs they recorded 
was variable (including some theme “reversals”), and suggested that all of their 
samples would be described as “aberrant” by Frumhoff’s classification. Cato (1991) 
mentioned the existence of “poorly structured” songs in Australia in some years, and 
in a separate study of South Pacific song (Helweg et al. 1998), theme deletions were 
observed, but theme reversals were not. Additionally, in a multiyear comparison of 
song evolution in Tonga, Eriksen et al. (2005) reported that the order of theme 
transitions was not consistent from year to year. Although they do not provide data 
to quantify the number of seemingly “aberrant” song cycles, they graphically demon- 
strate that the song sequence in particular years of their sample is more variable than 
others, including a higher degree of theme “reversals” in certain years. 

A multiyear study of humpback whale breeding behavior conducted off Isla 
Socorro, Mexico, has also resulted in the compilation of many hours of recordings of 
singing males across years. Recordings of multiple males in 2004 suggest greater var- 
iability in theme order than typically expected, or documented in adjacent years 
(DMC, unpublished data; Smith-Aguilar 2009). Rather than singing each theme in 
an ordered sequence, males recorded in this period often switch back and forth 
between themes, including a higher degree of “theme reversals” than previously 
reported (Fig. 4). 

In this situation, defining what constitutes a “song” by the classic definition is 
extremely difficult. There appears to be no clear, overarching sequence that all males 
in that region and time period followed while singing. This is not to say that the 
sequence of themes is random—in fact, examination of the transitions between 
themes reveals that some transitions are common (e.g., the transition from theme 
B → G, or theme O → B; Fig. 4), while others are absent (e.g., there is never a transi- 
tion from theme B → N; Fig. 4). The difference in the loosely structured pattern of 
songs from males in the Mexico sample, and those singing a more traditionally 
ordered song, is both visually evident (Fig. 3) and quantifiable (Fig. 4). 
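
The kind of tally summarized in Fig. 4 (the number and proportion of transitions between themes, and the share of transitions that follow a predicted order) can be sketched as follows. This is only an illustrative sketch, not the procedure used for the figure: the function names and the two toy sequences are hypothetical, each theme is represented by a single letter, and consecutive repeats of the same theme are skipped because only transitions between themes are counted.

```python
from collections import Counter


def theme_transitions(sequence):
    """Count transitions between consecutive, different themes."""
    pairs = [(a, b) for a, b in zip(sequence, sequence[1:]) if a != b]
    return Counter(pairs)


def transition_proportions(counts):
    """Express each transition count as a proportion of all transitions."""
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


def proportion_in_order(counts, predicted_order):
    """Share of transitions that follow a predicted cyclical theme order."""
    successors = predicted_order[1:] + predicted_order[:1]
    expected = set(zip(predicted_order, successors))
    total = sum(counts.values())
    return sum(n for pair, n in counts.items() if pair in expected) / total


# Hypothetical toy sequences (not data from either study site).
invariant = "ABCABCABC"   # every transition follows A -> B -> C -> A
variable = "ABACBCAB"     # includes reversals such as B -> A and C -> B

print(proportion_in_order(theme_transitions(invariant), "ABC"))           # 1.0
print(round(proportion_in_order(theme_transitions(variable), "ABC"), 2))  # 0.57
print(theme_transitions(variable)[("B", "N")])                            # 0 (absent transition)
```

A transition that never occurs simply has a count of zero, which is how an absent transition would show up in such a tally.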

Without more detailed study, it is impossible to say whether the variability in 
theme order observed in the songs of males recorded in Mexico 2004 is a short-term 
phenomenon or whether it is indicative of a larger-scale process. In an extensive, 
long-term study conducted on song evolution over many years, Payne and Payne 
(1985) found that occasionally, the song in a particular year seemed anomalous when 
compared to previous and subsequent seasons. Similarly, Cato (1991) found that 
in some years, song was more poorly structured than others. Possibly, the variability 
observed in the Mexico 2004 sample can be attributed to a period of rapid song evo- 
lution, or it may reflect a broader cycle of the ebb and flow of stereotypy across years. 
However, regardless of the underlying reason for this variability, its existence (here 
and in other samples, e.g., Eriksen et al. 2005) suggests that, in at least some cases, a 
different method of considering humpback whale song may be appropriate. 


Humpbacks as “Eventual Variety” Singers: An Avian Comparison 


There exists terminology in the avian literature that seems to effectively describe 
the humpback song system. Scholars of avian behavior defined bird song as a series of 
notes, uttered in such a way as to form a recognizable sequence (Thorpe 1961; 
Bremond 1963), in which the pauses between notes were shorter than the pauses 
between songs (Isaac and Marler 1963, Thielcke 1969). Variations on these patterns 
are called song types. Payne and McVay (1971) drew on this literature when develop- 
ing their terminology, but equated the entire sequence of themes sung by a hump- 
back whale to a song sung by a bird. 

However, we suggest that humpback phrases and bird songs are analogous. Early 
scholars of avian song noted different modes of singing behavior, in which an individ- 
ual may repeat the same song type many times before switching to a different song 
type (e.g., AAA BBB CCC), or switch immediately between different song types in 
his repertoire with each consecutive song (e.g., ABCABCABC; Hartshorne 1956). 
The difference between these modes of singing is often distinguished in the literature 
as “eventual variety” or “repeat” singing, and “immediate variety” or “serial” singing 
(Molles and Vehrencamp 1999). These terms may also be used at lower levels of song 
organization (for example to describe repetition of syllables within songs), but we 
refer here to their use in describing higher patterns of song presentation. Within the 
“eventual variety” category, variation may exist between repeats of the same song 
type, but this variation is less than that between song types (Stoddard et al. 1988, 
Anderson et al. 2008). 
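
The contrast between the two singing modes can be made concrete by collapsing a sequence of song types into runs of consecutive repeats. The sketch below uses the two schematic sequences given above; the function names are hypothetical, and the mean run length is only one simple way to summarize repetition, not a metric taken from the cited studies.

```python
from itertools import groupby


def run_lengths(song_types):
    """Collapse a sequence into (type, number of consecutive repeats) pairs."""
    return [(t, sum(1 for _ in group)) for t, group in groupby(song_types)]


def mean_repeats(song_types):
    """Average number of consecutive renditions before a switch."""
    runs = run_lengths(song_types)
    return sum(n for _, n in runs) / len(runs)


eventual = "AAABBBCCC"    # "eventual variety": repeat, then switch
immediate = "ABCABCABC"   # "immediate variety": switch after every song

print(run_lengths(eventual))    # [('A', 3), ('B', 3), ('C', 3)]
print(mean_repeats(eventual))   # 3.0
print(mean_repeats(immediate))  # 1.0
```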

Not all species fit neatly into one category or the other; in fact, there is a wide 
range of song presentation strategies. Some species may not consistently repeat all 
song types, and the degree of repetition may vary greatly between individuals (Eens 
et al. 1989). Other species exhibit complex rules for deciding when to repeat song 
types (Todt and Hultsch 1996). Still others sing extremely long, complex songs, in 
which even discerning the pattern is challenging, requiring detailed review of syllabic 
organization (Catchpole 1976). 

However, within the broad categories of singing modes, the repeated phrases of 
humpback song, especially those in “static” themes, would seem to fit well into the 
“eventual variety” category. In fact, the archetypical song structure described by Payne 
and McVay (1971) can be considered a specific case of eventual variety singing, in which 
a singer utters several repetitions of each phrase in his repertoire, passing from one type 
to another in an invariable order, until he has sung his complete repertoire, and then 
starts over again without pause. Variations in this paradigm have confounded several 
investigators and analyses since the original description, and may be better accommo- 
dated when considered in the larger paradigm of “eventual variety” singing behavior. 

Avian studies that address eventual variety songsters may include analyses of vari- 
ables such as degree of repetitivity and switching rate—factors shown to be impor- 
tant in interindividual interactions (Molles and Vehrencamp 1999, Molles 2006). 
Studies of species with more complex song systems incorporate many additional tech- 
niques to evaluate song organization, such as syllable organization (Catchpole 1976) 
or Euclidean distance between songs (Sorjonen 1987). In contrast, most humpback 
song studies considering the biological significance of song sequence have focused on 
the broadest level, examining features such as “song duration” (i.e., duration of the 
entire thematic sequence). 


The Problem with Measuring “Song Duration” 


Besides limiting the scope of our analyses, why is the present use of the hierarchical 
level of humpback “song” a problem? Consider the case in which the duration of a 
“song” is used as a quantitative metric to evaluate the impact of potential acoustic 
disturbance to singing whales (e.g., low-frequency active sonar experiments: Miller 
et al. 2000, Fristrup et al. 2003). Fristrup et al. (2003) used a straightforward method 
to delineate sequential songs: a song was measured as the interval between successive 
starts of a particular theme (that which was traditionally associated with surfacing), 
without respect to theme order beyond this “marker” theme. The authors measured 
“song duration” ranging from 5.5 min to over 33 min in length, and concluded that 
although songs in general increased in duration in response to the acoustic distur- 
bance, song length was highly variable. Another study (Miller et al. 2000) generally 
agreed that singing humpbacks increased the length of their songs in relation to 
broadcasts of low-frequency active sonar. Conversely, in a separate study, Sousa-Lima 
et al. (2002) used the number of phrases per theme to assess differences in song length 
between songs of males before and during boat approaches, and found that song 
length decreased in response to acoustic disturbance. Given that these studies were all 
trying to evaluate the potential impact of anthropogenic activity on singing behavior, 
one might question how to interpret their opposing results. 

Interpretation of such changes in song length relative to a stimulus is strongly 
dependent on the variability of theme order and occurrence in the sequence of 
themes defined as a song. If theme order is invariant and all themes occur in each 
cycle of the “song,” then a measure of “song duration” as currently defined can be 
informative, because each repetition through the cycle includes the same number of 
themes. We can predict that longer songs are achieved in one of several ways 
(e.g., by increasing the length of particular units or phrases, by increasing the num- 
ber of phrase repetitions in the theme, etc.). Therefore, it is possible that a “longer 
song” is the result of singers increasing the number of repetitions of phrases within 
(a) theme(s). Conversely, if theme order and occurrence are variable among consecu- 
tive “songs,” then it is also possible that males might sing a more “erratic” song, in 
which they switch back and forth before completing a cycle (i.e., returning to the 
theme arbitrarily chosen as the song “beginning” without uttering all themes in a 
sequence). Knowing what type of response led to an increase in “song duration,” as 
well as knowing what constituted a typical “song,” would be critical in interpreting 
the mechanism of the singers’ reaction. Biologically, these differences suggest dis- 
tinct responses to acoustic interference, but none of the studies presented data on 
variability in thematic composition, nor attempted to quantify the underlying 
reasons for the large variation or observed change in “song” duration. 

In general, when theme order is less stable, consistently defining a sequence that 
comprises a “song,” and consequently measuring “song length,” becomes difficult. 
Consider the following example of an actual theme sequence recorded in Brazil 2002 
(Sousa-Lima 2007), when six themes were clearly identified: 


...1234124546123465461234546121246... 


We may arbitrarily set the beginning of the song to “Theme 1.” If we delineate 
songs in this sample based on the requirement that a “song” includes a complete ren- 
dition of all themes, our song sequence would be as follows: 


...1234124546/12346546/1234546/121246... 


If instead we used Frumhoff’s (1983) definition, our “song” could be categorized 
by the repetition of themes 1-2-3 (since this pattern is repeated more than twice in 
the song session), perhaps leading to the following: 


...1234124546/12346546/1234546121246... 


Or, following the precedent of the Fristrup et al. (2003) study, we could start a 
new song each time we encounter Theme 1. Thus our song sequence would be as fol- 
lows: 


...1234/124546/12346546/1234546/12/1246... 


Although each of the above examples uses the exact same thematic sequence, dif- 
ferent methods of delineating song structure (each published and utilized in different 
studies) lead to very different results in terms of the number of “songs” in the 
sequence, in the measured duration of each song, and in the thematic composition 
and order in each song. As a result, attempts to derive biologically relevant interpre- 
tations from analyses at the “song” level become difficult at best, and interpretation 
of the results is impossible without description of the variability of theme order. This 
real example demonstrates that the definition of “song,” and consequently the metric 
of “song duration,” as currently measured for humpback whales is in some cases prob- 
lematic, and raises the question as to the biological interpretation of analyses con- 
ducted at this hierarchical level. 
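
Two of the delineation rules applied above (a complete rendition of all themes, and a new song at every occurrence of a marker theme) are simple enough to reproduce in a short sketch using the Brazil 2002 theme sequence quoted in the text. The function names are hypothetical, each theme is treated as a single character, and the first theme of the excerpt is treated as a song start, which is an assumption because the excerpt begins mid-recording.

```python
SEQUENCE = "1234124546123465461234546121246"  # theme sequence quoted in the text


def split_at_marker(seq, marker="1"):
    """Start a new 'song' at every occurrence of the marker theme."""
    songs, current = [], ""
    for theme in seq:
        if theme == marker and current:
            songs.append(current)
            current = ""
        current += theme
    if current:
        songs.append(current)
    return songs


def split_on_complete_rendition(seq, marker="1"):
    """Start a new 'song' at the marker theme only once every theme has been sung."""
    all_themes = set(seq)
    songs, current = [], ""
    for theme in seq:
        if theme == marker and set(current) == all_themes:
            songs.append(current)
            current = ""
        current += theme
    if current:
        songs.append(current)  # trailing fragment, possibly incomplete
    return songs


by_marker = split_at_marker(SEQUENCE)
by_rendition = split_on_complete_rendition(SEQUENCE)

print("/".join(by_marker))     # 1234/124546/12346546/1234546/12/1246
print("/".join(by_rendition))  # 1234124546/12346546/1234546/121246
print(len(by_marker), len(by_rendition))  # 6 4
```

The same 31 theme renditions yield six "songs" under one rule and four under the other, with lengths ranging from 2 to 10 themes, which is the ambiguity described above.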

This is not to say that measures of song cycles, as classically defined for humpback 
whales, are without significance. When theme order is invariant, dividing sequences 
into song cycles for analyses may be relevant because the “songs” being compared are 
true repeats of the same pattern. And clearly, in at least some years, male humpback 
whales do seem to adhere to a hierarchical sequence at this level, which seems likely 
to be biologically relevant. However, when individuals incorporate a larger degree of 
variation into their thematic sequence, trying to force this variation into a structure 
that is defined by ordered repetition is not appropriate. The outcome is a series of 
arbitrarily delineated “songs,” none of which is like the other. 

A final note on song designation: while breathing intervals have sometimes been 
used as a “convenient way to define a beginning and end” of a song (Winn et al. 
1970, Payne and Payne 1985), our own observations, as well as other studies (Winn 
and Winn 1978, Winn et al. 1981), document individuals breathing during different 
themes, so the temptation to use breathing cycle as a measure of “song length” should 
be avoided. 


Application of Songbird Metrics to Humpback Whale Song 


From a biological perspective, the degree of variation in theme order may be infor- 
mative. However, many researchers choose to assess effects at the song cycle level, and 
these potentially relevant differences in singing behavior are consequently ignored. 
Regrettably, this is likely limiting our understanding of the use of song within the 
humpback mating system. Instead, if we expand our conceptual framework to learn 
from studies of song in other taxa, we can consider humpback whales as “eventual 
variety” singers, and apply established methods for assessing variation and effects 
within this different paradigm. Some authors have started to move towards quantify- 
ing song structure at the level of phrases or repetitions of phrases (Eriksen et al. 2005, 
Tougaard and Eriksen 2006, Cholewiak 2008). 

Recent work (Cholewiak 2008) applying some of the avian song metrics to analy- 
ses of humpback whale song has found that males do vary their song presentation 
in ways that are quantifiable on the level of phrase-based analyses. For example, an 
analysis of switching rate found that males significantly increase the rate at which 
they switch between different phrase types in the presence of a second singer, as 
compared to when they are alone (Cholewiak 2008). This type of analysis provides 
different and potentially more revealing information than may be obtained, for 
example, from traditional transition matrices or measures of “song” duration. The 
use of similar metrics in songbirds has revealed that changes in these variables may, 
for example, indicate an escalation of aggression between males (Krebs et al. 1981, 
Vehrencamp 2001). Further applications of these types of methods to humpback 
whale songs, such as using quantitative metrics to measure the degree of repetition 
over time rather than across song cycles, may be important in advancing our under- 
standing of the complex nature of humpback whale vocal behavior. 
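
A time-based switching rate of the kind discussed above could be computed as in the following sketch. The exact definition used by Cholewiak (2008) is not given in this review, so the formula here (switches between different phrase types per minute of singing) is an assumption, and the function name, phrase sequences, and durations are all hypothetical illustrations rather than data.

```python
def switching_rate(phrases, duration_min):
    """Switches between different phrase types per minute of singing.

    Assumed definition for illustration; not taken from the cited study.
    """
    if duration_min <= 0:
        raise ValueError("duration_min must be positive")
    switches = sum(1 for a, b in zip(phrases, phrases[1:]) if a != b)
    return switches / duration_min


# Hypothetical phrase sequences, each covering 6 min of singing.
alone = "AAAABBBBCCCC"        # long runs of each phrase type
with_singer = "AABBCCAABBCC"  # shorter runs, more frequent switching

print(round(switching_rate(alone, 6.0), 2))        # 0.33
print(round(switching_rate(with_singer, 6.0), 2))  # 0.83
```

Because the rate is expressed per unit time rather than per song cycle, it does not depend on how, or whether, "songs" are delineated.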
CONCLUSIONS AND RECOMMENDATIONS 


Investigations of humpback whale song have spanned over four decades, yet there 
are still many unanswered questions regarding the degree of variation within and 
between individuals, the ways in which changes in song structure are transmitted 
over space and time, and the role of song within the breeding system. Pioneering 
work in the 1970s and 1980s developed a framework within which to analyze and 
understand humpback song structure, which was later reinforced and expanded by 
further studies. Humpback whale song is a constantly changing phenomenon, which 
has captured the interest of many students of animal behavior, while also presenting 
challenges with respect to qualitative and quantitative analyses. 

At the lower hierarchical levels within the humpback song framework, traditional 
definitions have worked well to identify song elements for analyses. At the middle 
levels, however, delineation of “phrases” and “themes” has been complicated both by 
the lack of well-defined time intervals between repetitive sequences and by the 
ever-changing nature of song features. At the highest hierarchical level within this 
framework, the application of song cycle measures has obscured the degree of varia- 
tion in theme order that exists between males, and has made interpretation and com- 
parison of studies using different classification schemes problematic at best. While 
individuals have been shown to often sing themes in a relatively fixed order, it 
appears that in some regions (and/or time periods) there is variation in the consistency 
with which individual males adhere to a strict sequence. In the majority of studies to 
date, the potential biological relevance of this variability has been overlooked, because 
it is difficult to quantify within the traditional framework. In these situations, we 
suggest that measures at the level of “song” (as a fixed sequence of themes) be used 
with extreme caution, and that alternative analyses be considered. In light of the bur- 
geoning body of work on humpback whale singing behavior, we hope that this 
review will enable more consistent, meaningful comparisons across studies. 


Recommendations 


Given the analytical approaches that have become common in studies of humpback 
whale song, and our assessment of the inherent variation in song behavior and associ- 
ated problems, we make the following recommendations for studies of humpback 
whale song patterns: 


(1) Maintaining the use of “units” and “subunits” in song description and measure- 
ments. 

(2) Use of a consistent set of criteria for delineation of phrases, as described in this 
review. 

(3) Maintaining and using vocabulary describing different theme types, as described 
in Payne and Payne (1985). 

(4) Abandonment of the classical use of the “song duration” metric as a response 
variable, except in the specific case in which theme order and occurrence 
within and between singers are demonstrated to be invariable or nearly invari- 
able. 

(5) Adoption of the “phrase” as the salient level of repetition, as analogous to a bird 
“song” in the avian literature, and consequent exploration of analysis approaches 
that focus on sequences of phrases, following the birdsong paradigm for analysis 
of song sequences. 